ABSTRACT

Modeling  and  Simulation  is  an  important  tool  in  the  development  of  the  highly  effective  weapons  systems  built  by 
the  United  States  and  its  allies.  However,  recent  initiatives  to  reduce  the  cost  of  weapon  systems  through  expanded 
use  of  modeling  and  simulation  during  the  development  process  have  not  always  lived  up  to  expectations.  Current 
practice  in  the  construction  of  models  and  simulations  primarily  uses  a  manual  implementation  of  equations  to 
describe  the  entity  being  modeled.  After  verifying  correct  operation,  these  models  are  then  validated  by  comparing 
them to data from real-world tests to ensure accuracy. These equation-based models require extensive time and
money  in  order  to  construct  high  fidelity  models  that  accurately  represent  the  real  world.  Our  research  explores  an 
alternate  method  of  creating  accurate  models  and  simulations  that  can  be  done  rapidly  and  at  much  lower  cost.  This 
approach  uses  hybrid  artificial  intelligence  to  create  the  models  and  simulations  directly  from  validation  data  sets. 
Test  results  using  this  method  of  modeling  militarily  representative  systems  such  as  wing  lift,  radar,  and  Forward 
Looking  Infrared  (FLIR)  demonstrated  a  reduction  of  over  90%  in  human  labor  required  to  create  the  models  while 
simultaneously  achieving  approximately  70%  better  accuracy  as  compared  to  equation-based  models  prior  to 
validation.  Because  this  method  builds  the  models  from  a  data  set,  the  method  can  be  used  to  construct  models  of 
activities  such  as  human  decision-making  that  cannot  be  described  using  an  equation-based  approach.  Additionally, 
the  research  demonstrated  that  models  created  using  this  method  could  be  fully  integrated  with  existing  equation- 
based  models.  This  research  has  the  potential  to  dramatically  improve  the  war-fighting  capability  of  the  United 
States  and  its  allies  by  providing  a  fast,  inexpensive  method  to  model  any  entity  for  which  data  are  available. 
INTRODUCTION

Modeling  and  Simulation  (M&S)  is  an  important  tool 
for  performing  trade  studies  in  systems  engineering. 
M&S  provides  designers  with  the  ability  to  examine  a 
large  number  of  virtual  designs  before  constructing  a 
prototype  or  system.  This  provides  a  variety  of 
benefits,  including  balancing  requirements  with 
available  funding  and  schedule,  determining  risk  areas, 
building  efficient  test  plans  and  reducing  test 
requirements.  The  aggressive  use  of  modeling  and 
simulation  is  one  of  the  few  tools  that  have 
demonstrated  the  simultaneous  achievement  of  a  better 
product  brought  to  market  in  less  time  at  a  lower  cost 
[DTSE&E, 1996].

Attaining  these  benefits  currently  requires  an  extensive 
up-front  investment.  In  many  cases,  small  programs  do 
not  have  the  resources  to  make  this  investment 
[DTSE&E,  1996].  Reducing  the  cost  of  modeling  and 
simulation  so  that  it  becomes  affordable  for  use  in 
smaller  programs  and  product  developments  would 
represent  a  substantial  benefit  to  product  and  system 
development.  Our  research  investigates  reducing  the 
high  cost  and  length  of  time  required  to  build  models 
by  introducing  an  alternative  hybrid  artificial 
intelligence  method  that  creates  models  from  data  sets. 
These  models  are  then  compared  in  both  time  of 
construction  and  accuracy  to  the  current  equation- 
based  modeling  technique. 

Benefits  and  Costs  of  Modeling  and  Simulation 

As  computer  power  continues  to  increase,  model 
builders  are  able  to  build  increasingly  more  complex 
and  accurate  virtual  representations  of  real-world 
entities  [Zittel,  1998].  These  have  provided  impressive 
improvements  in  product  quality,  reductions  in  time  to 
develop  products  and  lower  product  costs.  Table  1 
provides  a  summary  of  some  documented 
improvements  from  a  study  of  the  use  of  modeling  and 
simulation  in  both  the  public  and  private  sectors. 


Table 1. Measured Benefits of Modeling and Simulation (note 1)

Who                          What                                  Traditional Method        New Method with M&S
---------------------------  ------------------------------------  ------------------------  ------------------------------------------
TRW                          Radar Warning System Design           96 man-months             46 man-months
TARDEC                       BFV Engineering and Analysis          4-6 man-months            0.5 man-months
TARDEC                       Low Silhouette Tank Design            55 engineers, 3 years     14 engineers, 16 months
General Electric             Engine Fan Blade                      4 weeks                   A few hours
Lockheed Martin              Engineering Mock-ups                  2100 hours                900 hours
Lockheed Martin              Changes per Final Drawing             4                         2
Lockheed Martin              Physical Mock-ups                     $30M each                 None
Lockheed Martin              Design Verification                   Baseline                  30% - 50% reduction from baseline
IBM                          Computers                             10,000 parts, 4 years     4000 parts, 2 years
Motorola                     Cellular Devices                      Baseline                  50% reduction in product cycle time
Sikorsky Aircraft            Helicopter External Working Drawings  38 draftsmen, 6 months    1 engineer, 1 month
NAVSEA                       Ship Seakeeping Analysis              27 days                   3.5 days
NAVSEA                       Radar Cross Section Analysis          57 days                   17 days
Comanche Helicopter Program  Source Selection                      Prototype Fly-off, $500M  Simulator/Surrogate Aircraft Fly-off, $20M

As can be seen from these examples, most of the
success stories found during this study involved large
government programs and/or products developed by
large corporations. Across complex Department of
Defense (DoD) programs, a conservative average cost
benefit in operating costs for virtual/constructive
training over live training has been at least 20:1. A
simulated joint exercise led by NAVAIR Orlando at
I/ITSEC 2003 again validated this ratio. The cost
advantage would have been at least 30:1 if the cost of
precision munitions and environmental costs were also
included (note 2).


1  DTSE&E study (1996) “Study on the Effectiveness
of Modeling and Simulation in the Weapon System
Acquisition Process”, Final Report.

Although  the  benefits  of  Modeling  and  Simulation  are 
significant,  these  benefits  come  at  a  steep  price.  An 
aggressive  M&S  effort  requires  an  extensive  up  front 
investment.  For  the  virtual  exercise  at  I/ITSEC  2003 
costs  ran  between  $300-400K  for  the  two-day  event, 
and  had  the  advantage  of  millions  of  dollars  of  R&D 
supporting  the  products.  Development  of  the  Boeing 
777,  a  recognized  business  success  case  in  which  M&S 
played  a  significant  role,  required  an  up  front 
investment  of  roughly  one  hundred  million  dollars 
[Garcia et al., 1994]. The M&S Core Body of
knowledge  states  under  limitations  that  “M&S  tools  are 
not  generally  inexpensive  and  require  an  up-front 
investment  cost”  [Acquisition  Functional  Working 
Group,  1999].  This  statement  is  backed  up  by  results 
of  a  study  looking  at  the  cost  of  the  M&S  effort  on 
Department  of  Defense  programs  summarized  in  table 
2. 


Table 2. Department of Defense M&S Cost Data (note 3)

Program                Approximate Total Program Cost  M&S Expenditures
---------------------  ------------------------------  ----------------
LPD-17 (ship)          $10B                            $38M
ATACMS/BAT (munition)  $5B                             $25.2M
Javelin (missile)      $4B                             $48M
AN/BSY-2 (sonar)       $3B                             $58.3M

This same study found that program managers do not
consider DoD-wide M&S investments as either cost or
schedule effective [Hicks & Associates, Inc., 2001].


2  Data provided courtesy of Northrop Grumman
Corporation

3  Hicks & Associates, Inc., (2001) “Modeling and
Simulation Survey Briefing”.


Why  is  M&S  so  Expensive? 

Looking  at  the  modeling  of  a  simple  system 
demonstrates  the  high  cost  and  time  associated  with 
building  equation-based  models,  even  for  systems  that 
have  well  understood  equations.  One  particular  case 
evaluated  the  building  of  a  model  for  simulating  the 
performance  of  a  spring-powered  car  [Brown,  1999]. 
The  model  was  constructed  for  use  by  students  taking  a 
course  in  systems  engineering  at  the  Defense  Systems 
Management  College  and  was  designed  to  demonstrate 
the  value  of  modeling  and  simulation  in  cost- 
performance  trades.  The  exercise  involved  conducting 
a  series  of  trades  to  find  a  combination  of  variables  that 
provided  good  performance  for  only  two  performance 
requirements  at  the  lowest  cost.  The  final  equation  of 
motion  for  this  simple  vehicle  had  34  variables  and  8 
coefficients.  Modeling  and  simulation  of  complex 
systems  may  require  an  extremely  large  number  of 
variables  and  coefficients  as  well  as  the  equations  that 
relate  them  together.  It  is  highly  unlikely  that  a 
company  making  spring-powered  cars  could  afford 
even  a  tiny  fraction  of  the  costs  in  table  2.  The  study 
compared  a  group  of  students  who  used  the  model  with 
a  control  group  that  did  not  have  access  to  the  model. 
The  use  of  M&S  in  the  design  phase  resulted  in  better 
performance  at  lower  cost  for  the  same  amount  of  time 
spent  on  the  project  [Brown,  1999].  The  benefits  of 
using  M&S  in  the  design  phase  of  any  project, 
regardless  of  size,  are  significant. 

Once  any  model  is  built,  it  must  be  verified  and 
validated  before  use  [Acquisition  Functional  Working 
Group,  1999].  Verification  tests  that  the  model  has 
been  implemented  correctly,  while  validation  checks 
that  the  model  or  simulation  accurately  represents  the 
real-world system. To correctly validate a model, the
actual system is tested over the range of values for
which the model or simulation is intended to be used. The model
predictions  are  checked  against  the  test  data.  If  the 
model  does  not  agree  within  specified  limits  in  any 
area  with  the  test  data,  further  tests  are  conducted  to 
determine  the  cause  of  the  difference.  These  causes 
are  then  mathematically  incorporated  into  the  model 
and  the  results  checked  again.  This  process  continues 
until  an  acceptable  agreement  between  the  test  data  and 
the  model  is  obtained.  This  explains  the  finding  in  the 
M&S  Core  Body  of  Knowledge  which  states  that 
attempts  to  create  high-fidelity  models  rapidly  drive  up 
the  cost  of  a  modeling  effort  [Acquisition  Functional 
Working  Group,  1999].  The  key  to  wider  use  of  M&S 
in  product  development  is  to  significantly  reduce  the 
time  and  expense  of  current  modeling  methods. 
Quantifying Predictive Accuracy

Another  issue  with  current  methods  of  modeling  and 
simulation  is  quantifying  the  predictive  accuracy. 
Models  and  simulations  only  approximate  the  real 
world.  Most  models  provide  a  single  predictive  output 
for  a  single  set  of  inputs.  Instead  of  a  single 
prediction,  a  better  solution  would  be  to  provide  the 
range  of  values  over  which  the  true  answer  would  lie 
and  identify  which  values  are  more  likely  than  others. 

Many  current  M&S  software  packages  provide  a 
sensitivity  analysis  feature.  A  sensitivity  analysis 
varies  the  independent  variables  over  their  expected 
range  of  values  in  the  anticipated  operational 
environment  to  determine  the  sensitivities  (or 
gradients)  with  respect  to  the  dependent  variables  of 
interest  [Arsham,  2002].  The  model  or  simulation  is 
run  multiple  times  with  the  variable  on  which  the 
analysis  is  being  performed  incremented  by  a  fixed 
amount  on  each  run.  The  analysis  begins  at  either  the 
highest  or  the  lowest  value  of  the  sensitivity  range  and 
continues  until  the  opposite  end  of  the  range  is 
reached.  A  sensitivity  analysis  provides  a  more 
complete  answer  by  specifying  a  range  over  which  the 
answer  may  lie.  However,  this  answer  is  incomplete  in 
that  it  provides  no  information  about  where  within  the 
range  the  answer  is  most  likely  to  fall.  This  analysis  is 
sufficient  in  estimating  model  sensitivities  only  if  the 
effects  of  the  parameters  on  the  model  are  independent 
and  monotonic  [Bankes,  1993].  No  probability  or 
confidence  can  be  attached  to  the  range  of  even  the 
most  sensitive  variable.  Furthermore,  variables  that 
show  little  sensitivity  when  varied  independently  may 
exhibit  strong  sensitivity  when  varied  in  combination 
with  other  variables.  Thus,  running  a  sensitivity 
analysis  may  not  capture  the  true  range  of  the  solution 
space  of  the  dependent  variables. 
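
The limitation described above can be illustrated with a minimal sketch. The function `one_at_a_time_sweep` and the model `toy_model` are hypothetical names introduced only for this illustration; the toy model is a pure interaction term chosen so that each input shows no sensitivity on its own, and it is not one of the models used in the research.

```python
def one_at_a_time_sweep(model, baseline, name, low, high, steps):
    """Step one input from low to high by a fixed increment while the
    other inputs stay at their baseline values; return the output range."""
    increment = (high - low) / (steps - 1)
    outputs = []
    for k in range(steps):
        inputs = dict(baseline)
        inputs[name] = low + k * increment
        outputs.append(model(**inputs))
    return min(outputs), max(outputs)


def toy_model(x, y):
    # Hypothetical model: the output depends only on the interaction of x and y.
    return x * y


baseline = {"x": 0.0, "y": 0.0}
range_x = one_at_a_time_sweep(toy_model, baseline, "x", -1.0, 1.0, 5)
range_y = one_at_a_time_sweep(toy_model, baseline, "y", -1.0, 1.0, 5)

# Varying both inputs together exposes the range the single-variable sweeps miss.
joint_outputs = [toy_model(x, y) for x in (-1.0, 1.0) for y in (-1.0, 1.0)]
joint_range = (min(joint_outputs), max(joint_outputs))
```

In this sketch both single-variable sweeps report an output range of zero width, while the combined variation spans -1 to 1, and neither result says anything about which outputs are more likely.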

The  most  complete  method  found  was  the  modeling  of 
all  variables  that  exhibit  variation  as  random  variables 
[Bankes,  1993].  Each  random  variable  is  set  to  a 
distribution  function  that  describes  how  the  variable 
behaves.  A  Monte  Carlo  method  is  then  employed 
which  samples  from  each  distribution  over  multiple 
computer  runs  to  obtain  a  probability  distribution  of 
the  dependent  variables.  This  provides  a  complete 
probabilistic  solution  to  the  problem  in  that  all 
correlations  and  synergistic  effects  are  captured,  the 
complete  range  of  possible  outputs  is  captured,  and  the 
likelihood  of  each  answer  within  the  range  is  specified. 
The  accuracy  of  the  output  distribution  is  dependent 
only  on  the  number  of  samples  generated  and  is  not 
dependent  on  the  number  of  inputs.  The  Monte  Carlo 
method also allows use of standard statistical
techniques to estimate the precision of the output
distribution.  Although  this  method  provides  a  more 
complete  answer  to  predictive  accuracy,  its  primary 
drawback  is  that  it  can  take  vast  amounts  of  computer 
run  time  in  order  to  generate  the  distribution  if  the 
model  is  large  and  complex. 
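
A minimal sketch of the Monte Carlo procedure follows, using only the Python standard library. The model, the input distributions, and the names `monte_carlo`, `toy_model`, and `samplers` are assumptions made for illustration. The standard error of the mean shrinks with the square root of the number of samples, which is one standard way to estimate the precision of the output distribution.

```python
import random
import statistics


def monte_carlo(model, samplers, n_samples, seed=0):
    """Draw every input from its distribution, run the model, and
    collect the outputs over n_samples runs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        inputs = {name: draw(rng) for name, draw in samplers.items()}
        outputs.append(model(**inputs))
    return outputs


def toy_model(x, y):
    # Hypothetical stand-in for a model or simulation.
    return x * y


# Assumed input distributions, chosen only for the example.
samplers = {
    "x": lambda rng: rng.uniform(-1.0, 1.0),
    "y": lambda rng: rng.gauss(0.0, 0.5),
}

outputs = monte_carlo(toy_model, samplers, n_samples=10_000)
mean = statistics.fmean(outputs)
stdev = statistics.stdev(outputs)
standard_error = stdev / len(outputs) ** 0.5    # precision of the estimated mean
cut_points = statistics.quantiles(outputs, n=20)  # 5%, 10%, ..., 95% points
interval_90 = (cut_points[0], cut_points[-1])     # central 90% of the outputs
```

Each call to `model` is one full run, so the run-time drawback noted above scales directly with `n_samples` times the cost of a single run.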

TECHNICAL  APPROACH 

To  reduce  the  cost  of  constructing  models  and 
simulations,  the  research  approach  focused  on  reducing 
the  amount  of  human  labor  in  the  model  building 
process.  This  was  accomplished  by  using  artificial 
intelligence  agents  to  learn  the  relationships  between 
variables directly from data sets, creating
computer-generated models. Although this approach is not new,
the  exact  technical  approach  is  unique  in  that  a  hybrid 
software  package  using  both  Bayesian  and  neural 
networks  was  created  to  conduct  the  research.  This 
approach  overcomes  many  limitations  associated  with 
curve fitting, which cannot easily handle non-linear or
discontinuous  data  sets. 

Bayesian  Networks 

Bayesian networks are directed graphs for
representing probabilistic dependencies among
variables [Jensen, 1996]. Bayesian networks encode a
complete and coherent probability distribution over
many variables and can be used to evaluate both causal
and evidential influences. A Bayesian network
consists of a directed acyclic graph that represents
dependencies among variables, together with local
probability distributions defined for small clusters of
directly related variables. A directed acyclic graph
consists of nodes, which represent the variables, and
arcs (or directed edges) that describe cause-and-effect
relationships or statistical associations between the
variables [Jensen, 1996]. Each variable has a finite set
of mutually exclusive states. The graph may contain
no directed cycles, that is, paths that lead from a node
to itself and follow the direction of the arcs. Each node is
conditionally independent of its non-descendants given
its parents.

Probability  information  in  a  Bayesian  network  is 
specified  via  a  local  distribution  for  each  node.  The 
local  distribution  for  a  root  node  is  simply  an 
assignment  of  a  probability  to  each  state  such  that  the 
probabilities sum to 1. A conditional probability table
gives  the  local  distribution  for  a  child  node.  This  table 
shows  the  probability  of  each  possible  state  of  the  child 
conditional  on  each  possible  state  of  all  of  its  parents. 
The  joint  distribution  for  all  variables  in  the  network  is 
given by the product of the local distributions for all
the  nodes: 


$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_{pa(i)}) \qquad (1)$$

where $X_{pa(i)}$ denotes the parents of variable $X_i$.

The conditional probability that variable E takes on
value e given that H takes on value h is defined by the
equation:

$$P(E = e \mid H = h) = \frac{P(E = e \text{ and } H = h)}{P(H = h)} \qquad (2)$$

A straightforward consequence of this definition is
Bayes Rule, a powerful mathematical relationship by
which probabilities can be modified to incorporate new
evidence:

$$P(H \mid E) = \frac{P(H)\, P(E \mid H)}{P(E)} \qquad (3)$$

The  first  term,  P(H|E)  is  referred  to  as  the  “posterior 
probability”  or  the  probability  of  H  given  evidence  E. 
The  term  P(H)  is  the  prior  probability  of  H.  The  term 
P(E|H)  is  the  “likelihood”  and  gives  the  probability  of 
the  evidence  assuming  hypothesis  H  is  true.  The  last 
term  is  the  probability  of  E  that  acts  as  a  normalizing 
or  scaling  factor  [Niedermayer,  1998]. 
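
As a worked illustration of equations (1) through (3), the sketch below builds the smallest possible network, a single arc from H to E, and checks that the posterior obtained from the joint distribution agrees with Bayes Rule. The probability values are invented for the example and have no connection to the research data.

```python
# Local distributions for a two-node network H -> E (illustrative numbers only).
p_h = {"true": 0.2, "false": 0.8}              # root node: P(H)
p_e_given_h = {                                 # child node: P(E | H)
    "true": {"yes": 0.9, "no": 0.1},
    "false": {"yes": 0.3, "no": 0.7},
}

# Equation (1): the joint distribution is the product of the local distributions.
joint = {
    (h, e): p_h[h] * p_e_given_h[h][e]
    for h in p_h
    for e in ("yes", "no")
}

# Marginal probability of the evidence, P(E = yes).
p_e_yes = sum(joint[(h, "yes")] for h in p_h)

# Equation (2): a conditional probability is the joint divided by the marginal.
posterior_from_joint = joint[("true", "yes")] / p_e_yes

# Equation (3): Bayes Rule gives the same posterior from prior and likelihood.
posterior_from_bayes = p_h["true"] * p_e_given_h["true"]["yes"] / p_e_yes
```

With these numbers the prior P(H) = 0.2 rises to a posterior of 3/7, roughly 0.43, once the evidence E = yes is observed.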

To  demonstrate  a  Bayesian  network,  an  example  for 
diagnosing  problems  with  the  air  conditioning  of  any 
car  using  R-134a  refrigerant  is  shown  in  figure  1. 


Figure  1.  A/C  Bayesian  Network  Model 

In  this  example,  diagnosing  the  system  involves  taking 
pressure  readings  from  the  high  pressure  output  line 
from  the  compressor  and  the  low  pressure  return  line  to 
the compressor. The readings are then evaluated as to
whether they are high, normal, or low based on the
outside  air  temperature.  This  information  then 
determines  the  most  likely  status  of  the  system.  In 
figure  1,  if  the  low  side  pressure  is  32  psi,  the  high  side 
pressure  is  198  psi  and  the  outside  temperature  is  84 
degrees  F,  then  the  system  is  not  operating  normally 
and  the  most  likely  problem  is  that  the  refrigerant  level 
is  low. 

The  advantages  of  Bayesian  networks  include  the 
capability  to  learn  both  the  structure  of  the  networks 
and  the  probabilistic  relationships  between  the  nodes 
from  data  sets.  Through  the  use  of  Bayes  Rule  in 
calculating  the  distributions,  networks  respond  nearly 
instantaneously  to  node  state  inputs.  The  output  is  a 
probability  distribution  that  provides  not  only  the  range 
of  values  over  which  the  answer  may  lie,  but  also  the 
probability  of  each  answer  within  the  range.  Bayesian 
networks  can  also  provide  these  distributions  with  an 
incomplete set of input parameters. The principal
disadvantage  of  these  networks  is  that  they  cannot 
always  provide  predictions  to  inputs  that  were  not  in 
the  learning  data  set. 
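
The learning of probabilistic relationships from data, and the disadvantage just noted, can be sketched with simple frequency counting. This is only one basic way to estimate a conditional probability table and is not a description of any particular package; the function `learn_cpt`, the variable names, and the records are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_cpt(records, parent, child):
    """Estimate P(child | parent) by counting co-occurrences in the records."""
    counts = defaultdict(Counter)
    for row in records:
        counts[row[parent]][row[child]] += 1
    return {
        parent_state: {
            child_state: n / sum(child_counts.values())
            for child_state, n in child_counts.items()
        }
        for parent_state, child_counts in counts.items()
    }


# Invented learning data: eight observations of a discretized reading and a status.
records = (
    [{"reading": "low", "status": "fault"}] * 3
    + [{"reading": "low", "status": "ok"}] * 1
    + [{"reading": "normal", "status": "ok"}] * 4
)

cpt = learn_cpt(records, parent="reading", child="status")

# A parent state that never appeared in the learning data has no table entry,
# so a table learned this way cannot supply a distribution for it.
unseen = cpt.get("high")
```

Here the learned table gives P(fault | low) = 0.75, but a reading of "high" returns nothing because that state was absent from the learning set.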

Neural  Networks 

Neural  networks  are  computational  systems  that  mimic 
the  computational  abilities  of  biological  systems  by 
using  large  numbers  of  simple,  interconnected  artificial 
neurons [Maren et al., 1990]. There are different types
of  neural  network  applications  available  for 
consideration.  They  fall  into  five  basic  categories: 
prediction,  classification,  data  association,  data 
conceptualization  and  data  filtering.  The  primary  use 
of  a  neural  network  in  this  research  is  for  prediction. 
Types  of  predictive  neural  networks  include  the  back- 
propagation,  delta  bar  delta,  extended  delta  bar  delta, 
directed  random  search,  higher  order  or  functional  link, 
and self-organizing map into back-propagation
[Anderson and McNeil, 1992]. A feed-forward
back-propagation network (usually referred to simply as a
back-propagation network) was selected for use in this
research.  A  neural  network  of  this  type  contains  an 
input  layer,  one  or  more  hidden  layers  and  an  output 
layer.  A  typical  back-propagation  neural  network  is 
shown  in  figure  2.  The  input  layer  nodes  feed  the  input 
values  into  the  rest  of  the  network.  Connections 
between  layers  are  bi-directional.  Data  values  move 
from  inputs  through  the  hidden  layers  to  the  outputs 
during  feed  forward  operation.  During  learning,  error 
corrections  are  propagated  back  through  the  network 
starting  from  the  output  nodes  and  running  upward 
through  all  hidden  nodes  from  the  bottom  to  the  first 
hidden  layer. 
Figure 2. Example Neural Network

All  hidden  and  output  nodes  in  the  network  have  the 
structure shown in figure 3.


Figure  3.  Neural  Network  Node 

During  feed  forward  operation,  the  node  first  calculates 
the  sum  of  all  inputs  (I)  times  their  weights  (W).  A 
transfer  function  is  then  applied  to  the  sum.  This 
function  transforms  the  output  into  a  number  between 
zero  and  one  (minus  one  and  one  in  some  software 
packages).  There  are  several  transfer  functions  that 
can  be  used.  All  functions  have  a  ramp,  bell  or 
modified  S-shaped  curve  that  runs  asymptotically 
along  the  X-axis  approaching  either  the  maximum  or 
minimum value [Maren et al., 1990]. The type of
transfer function is manually selected during network
construction, while the weights for the input
connections are calculated during the learning process.
All  inputs  must  also  be  scaled  to  values  between  zero 
and  one.  Outputs,  which  are  all  values  between  zero 
and  one,  must  be  scaled  in  the  reverse  direction  from  a 
decimal  value  to  the  actual  value. 
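
The feed-forward calculation of a single node, together with the input and output scaling, can be sketched as follows. The logistic sigmoid is used as one common S-shaped transfer function; the weights, the value ranges, and the helper names `scale`, `unscale`, and `node_output` are assumptions for illustration, and a real network would chain many such nodes across layers.

```python
import math


def scale(value, low, high):
    """Map a raw value from [low, high] onto [0, 1]."""
    return (value - low) / (high - low)


def unscale(fraction, low, high):
    """Map a network output in [0, 1] back to the actual value range."""
    return low + fraction * (high - low)


def sigmoid(x):
    """S-shaped transfer function with outputs between zero and one."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights):
    """Sum the inputs times their weights, then apply the transfer function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(weighted_sum)


# Illustrative values only: two raw inputs, their assumed ranges, and
# weights that would normally come from the learning process.
raw_inputs = [84.0, 32.0]
input_ranges = [(60.0, 110.0), (0.0, 80.0)]
weights = [0.5, -0.25]

scaled = [scale(v, lo, hi) for v, (lo, hi) in zip(raw_inputs, input_ranges)]
activation = node_output(scaled, weights)
prediction = unscale(activation, 100.0, 300.0)  # assumed output range
```

Because the transfer function is bounded, the unscaled prediction can never leave the assumed output range, which is one reason the scaling limits need to cover the full range of the training data.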

The primary advantage of a neural network is that it is
capable of adaptive learning of very complex problems
[Maren et al., 1990]. These networks can predict
additional  values  within  the  range  of  the  training  data 
set.  Neural  networks  can  also  handle  both  non-linear 
and  non-continuous  functions.  The  disadvantages  of 
neural  networks  include  a  single  output  predictive 
answer  with  no  information  of  how  probable  or 
accurate  that  answer  may  be  and  the  requirement  for  a 
complete  set  of  inputs. 

Hybrid Networks

Use  of  multiple  types  of  artificial  intelligence  networks 
at  the  same  time  is  currently  an  area  of  high  interest  to 
researchers. The Northrop Grumman Corporation
(NGC) has used combinations of networks for data
fusion and to handle the high levels of uncertainty
found in highly complex data problems. Hybrid
networks  are  emerging  from  NGC  work  with  Applied 
Minds  under  a  program  called  Futures  Lab.  This 
approach  uses  Bayesian  networks,  along  with  other 
types  of  artificial  intelligence  networks,  to  fuse 
evidence  at  the  hypotheses  while  using  neural 
networks  to  reconcile  the  network  outputs. 

Research  Software  Implementation 

To  conduct  the  research,  a  software  package  capable  of 
creating  Bayesian  network  models  from  data  sets  was 
required.  A  search  of  existing  applications  found  no 
software  package  suitable  for  creating  engineering 
models  from  data  sets  containing  mixtures  of  discrete 
and  continuous  variables.  The  primary  deficiency  was 
the  absence  in  currently  available  packages  of  methods 
for  intelligent,  simultaneous  discretization  of  multiple 
continuous  variables.  This  led  to  the  development  of 
the  derivative  method  of  discretization  that  is 
implemented  in  the  research  software  package 
described  below.  The  software  package,  BN  Builder, 
integrates  new  code  to  implement  the  discretization  of 
continuous  variables  with  four  commercial  software 
packages  providing  the  rest  of  the  functionality.  The 
software  architecture  and  data  flow  are  presented  in 
figure 4. The input to the software is a Microsoft
Excel  data  set.  A  neural  network  is  manually  created 
using  QNET  2000.  The  weights  are  learned  from  the 
input  data  set.  Bayesian  network  structure  learning  is 
performed  by  BN  PowerConstructor,  one  of  the 
modules  in  the  BN  PowerSoft  collection  by  Jie  Cheng 
of  the  University  of  Alberta,  Canada. 


Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC) 2005, Paper No. 2009


Figure  4.  Research  Software  Architecture 

The  neural  network  generates  additional  data  for  the 
learning data set based on user input. The BN Builder
software  creates  a  Bayesian  network  using  the 
augmented  data  set.  Nodes  with  continuous  data 
values  are  discretized  using  the  derivative  method. 
Additional  uncertainty  from  the  neural  network 
predictions  is  captured  during  learning  by  comparing 
neural  network  predictions  with  the  training  data  and 
adding  the  additional  variance  into  the  node  probability 
distribution.  The  output  of  the  research  software  is  a 
Bayesian  network  that  is  capable  of  making 
probabilistic  predictions  to  incomplete  input  data  and 
that  can  make  predictions  to  inputs  not  contained 
within  the  learning  data  set. 
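
The idea of folding prediction uncertainty into a node's probability distribution can be sketched under heavy simplifying assumptions. In the sketch below a least-squares line stands in for the neural network, equal-width bins stand in for the derivative method of discretization (whose details are not given here), and the residual spread on the learning data is treated as normally distributed. All names and data values are hypothetical, and this is not the BN Builder implementation.

```python
import math


def fit_line(xs, ys):
    """Least-squares line; a simple stand-in for the trained neural network."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normal_cdf(value, mean, sd):
    return 0.5 * (1.0 + math.erf((value - mean) / (sd * math.sqrt(2.0))))


def bin_probabilities(mean, sd, edges):
    """Probability of each discretized state; the two outer bins are open-ended."""
    cdf = [0.0] + [normal_cdf(e, mean, sd) for e in edges] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


# Invented learning data; the input x = 4.0 is deliberately absent.
learn_x = [1.0, 2.0, 3.0, 5.0, 6.0]
learn_y = [2.1, 3.9, 6.2, 9.8, 12.1]

slope, intercept = fit_line(learn_x, learn_y)

# Compare predictions with the learning data to estimate the extra uncertainty.
residuals = [y - (slope * x + intercept) for x, y in zip(learn_x, learn_y)]
residual_sd = math.sqrt(sum(r * r for r in residuals) / (len(residuals) - 2))

# Predict at an input not in the learning set and spread the prediction over bins.
new_x = 4.0
predicted_y = slope * new_x + intercept
edges = [7.6, 7.8, 8.0, 8.2, 8.4]  # assumed equal-width bin boundaries
distribution = bin_probabilities(predicted_y, residual_sd, edges)
```

The result is a probability for each discretized output state at an input that was never observed, rather than a single point prediction, which is the behavior the hybrid approach is intended to provide.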

METHODOLOGY 

The  methodology  used  to  conduct  the  research 
consisted  of  comparing  models  and  simulations 
constructed  using  a  conventional,  manual  equation- 
based  implementation  with  computer-generated  models 
and  simulations  created  using  the  software  described 
above.  Equation-based  models  were  constructed  using 
mathematical  equations  from  published  textbooks.  The 
same  individual  constructed  all  but  one  model  with 
construction  time  recorded  to  the  nearest  minute.  A 
complete  description  of  all  models,  tests  and  results 
can be viewed at https://acc.dau.mil/aicrms. The
models  used  in  the  research  are  listed  in  table  3. 
Models  were  evaluated  at  the  first  step  of  the  validation 
process.  The  computer-generated  models  were 
constructed  by  dividing  the  validation  test  data  into  a 
learning  set  and  a  test  set  of  data  points.  The  test  set 
always  contained  input  conditions  not  contained  within 
the  learning  set. 


Table 3. Research Model Matrix

Model  Name
-----  -------------------------------
1      Amplifier
2      LRC electrical circuit
3      Elevator control
4      Radar
5      Forward Looking Infrared (FLIR)
6      Commuter
7      Wing Lift

The  computer-generated  model  was  constructed  from 
the  learning  data  set,  and  then  used  to  predict  the 
outputs  of  the  test  data  set.  The  predictions  were 
compared  to  the  test  data  for  accuracy  using  the 
percent  difference  between  the  prediction  and 
measurement  as  the  accuracy  metric.  The  equation- 
based  models  were  used  to  predict  the  same  outputs  of 
the  test  data  set. 
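
The evaluation procedure can be sketched as below. The exact form of the percent difference is not spelled out in the text, so the sketch assumes the absolute difference taken relative to the measured value; the records, the held-out inputs, and the stand-in predictor are likewise hypothetical.

```python
def percent_difference(predicted, measured):
    """Absolute difference relative to the measurement, in percent (assumed form)."""
    return 100.0 * abs(predicted - measured) / abs(measured)


def split_by_input(records, held_out_inputs):
    """Divide the validation data so that the test set contains only
    input conditions that are absent from the learning set."""
    learning = [r for r in records if r["input"] not in held_out_inputs]
    test = [r for r in records if r["input"] in held_out_inputs]
    return learning, test


# Invented validation data: measured output at five input conditions.
records = [
    {"input": 1.0, "measured": 10.0},
    {"input": 2.0, "measured": 20.0},
    {"input": 3.0, "measured": 30.0},
    {"input": 4.0, "measured": 40.0},
    {"input": 5.0, "measured": 50.0},
]
learning_set, test_set = split_by_input(records, held_out_inputs={2.0, 4.0})


def stand_in_model(x):
    # Hypothetical predictor standing in for either type of model.
    return 10.5 * x


errors = [
    percent_difference(stand_in_model(r["input"]), r["measured"])
    for r in test_set
]
mean_error = sum(errors) / len(errors)
```

Applying the same `percent_difference` to both the computer-generated and the equation-based predictions on the same test set keeps the accuracy comparison like for like.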

Hypothesis #1 - Time of Construction

Null hypothesis H^1_0: There is no difference in
construction time between computer-generated models
and equation-based models.

Alternate hypothesis H^1_A: There is a difference in
construction time between computer-generated models
and equation-based models.

Hypothesis #2 - Model Accuracy

Null hypothesis H^2_0: There is no difference in
predictive accuracy between computer-generated
models and equation-based models.

Alternate hypothesis H^2_A: There is a difference in
predictive accuracy between computer-generated
models and equation-based models.

RESULTS 

An  equation-based  model  and  a  computer-generated 
model  were  constructed  for  each  system  listed  in  table 
3.  The  wing  model  was  not  included  in  the  time 
comparison  as  it  was  constructed  by  an  outside  source 
with  no  record  of  construction  time.  A  comparison  of 
construction  times  is  included  in  figure  5.  As  can  be 
seen  in  figure  5,  the  computer-generated  models  were 
constructed  in  less  time  than  the  manually  constructed 
equation  models  in  all  cases  except  the  amplifier.  This 
was  due  to  a  unique  case  where  the  modeling  software 
package  came  with  a  pre-built  amplifier  element. 
Figure 5. Model Construction Time


Hypothesis  H1  was  tested  with  the  data  of  figure  5 
resulting  in  a  rejection  of  the  null  hypothesis.  The 
computer-generated  models  took  less  time  to  construct 
in  five  of  six  cases  and  overall  took  an  average  of  one 
fifth  the  time  to  construct  as  compared  to  equation- 
based  models.  The  difference  is  so  great  in  these  five 
cases  that  it  supports  a  conclusion  that  construction 
time  for  computer-generated  models  is  less  than 
equation-based  models  at  95%  confidence. 

The  cost  of  modeling  and  simulation  is  driven  mostly 
by  the  human  labor  involved  in  the  process.  Although 
computer  equipment  and  software  require  upfront 
investments,  the  cost  of  computer  run  time,  once 
purchased,  is  negligible.  The  average  times  to  perform 
specific  tasks  while  constructing  the  models  are 
presented  in  figures  6  and  7. 


Figure 6. Equation-based Model Task Times

As  can  be  seen  in  figures  6  and  7,  not  only  has  the 
average  time  of  construction  been  reduced  from  136  to 
26  minutes,  but  the  task  loading  requiring  human  work 
has  been  reduced  from  100%  in  the  equation-based 
models  to  47%  for  the  computer-generated  Bayesian 
network  models  resulting  in  a  total  reduction  in  human 
labor  of  over  90%. 
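
The over-90% figure follows directly from the numbers quoted above. A short check of the arithmetic, assuming the 47% human share applies to the full 26-minute average (the variable names are illustrative only):

```python
# Average construction times (minutes) and human share of the task loading,
# as quoted in the text for figures 6 and 7.
equation_total_min = 136.0
equation_human_share = 1.00
computer_total_min = 26.0
computer_human_share = 0.47

# Human labor is the part of the construction time that needs a person.
equation_human_min = equation_total_min * equation_human_share
computer_human_min = computer_total_min * computer_human_share

time_reduction = 1.0 - computer_total_min / equation_total_min
labor_reduction = 1.0 - computer_human_min / equation_human_min

print(f"human minutes: {equation_human_min:.1f} -> {computer_human_min:.1f}")
print(f"reduction in total time:  {time_reduction:.1%}")
print(f"reduction in human labor: {labor_reduction:.1%}")
```

Under that assumption, roughly 12 of the 26 minutes need a person, compared with all 136 minutes for the equation-based models, which is a reduction in human labor of about 91%.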


Figure  7:  Computer-generated  Model  Task  Times 

The breakdown of model task times shows that total
construction time increases as model complexity
increases. However, the human tasks associated with
model construction remain nearly constant. Learning
the network structure and constructing the neural
network both require human input, but are
computer-aided tasks. The increase in construction time
is almost completely attributable to increased computer
run time of the BN Builder program. This leads to the
conclusion that models created using computer-generated
Bayesian networks would be much less expensive to build
than equation-based models. Not only is time of
construction significantly less, but the human labor
involved is also reduced. Because there is no longer a
strong relationship between complexity and human labor
required, costs to construct computer-generated
Bayesian networks are not sensitive to problem
complexity.

Hypothesis  H2  was  tested  using  thirteen  cases 
generated  from  six  of  the  models.  The  elevator  control 
model  was  not  included  as  it  had  a  discrete  output. 
The  control  system  sent  the  elevator  to  the  correct  floor 
in  all  cases  for  both  types  of  models.  A  summary  of 
model  errors  is  presented  in  figure  8. 


Figure  8.  Model  Error  Comparison 
Hypothesis H2 was tested with the data of figure 8
resulting  in  a  rejection  of  the  null  hypothesis.  The 
error  associated  with  computer  constructed  Bayesian 
network  models  is  lower  in  all  twelve  cases  and 
averaged  over  70%  less  as  compared  with  the 
equation-based  models.  The  magnitude  of  the 
difference  in  error  is  great  enough  to  establish  a 
statistical  difference  between  these  two  methods  at 
95%  confidence. 
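
The text does not state which statistical test was applied. As an independent cross-check that ignores the error magnitudes entirely, a one-sided sign test can be run on the count reported above (lower error in all twelve cases). This is a sketch of a substituted technique, not the authors' procedure, and the function name is illustrative:

```python
from math import comb

def sign_test_p_value(wins: int, cases: int) -> float:
    """One-sided sign test: probability of at least `wins` successes in
    `cases` fair coin flips (null hypothesis: neither method is better)."""
    return sum(comb(cases, k) for k in range(wins, cases + 1)) / 2 ** cases

# Count reported in the text: lower error in 12 of 12 cases.
p_value = sign_test_p_value(12, 12)
print(f"p = {p_value:.6f}")
print("reject null at 95% confidence:", p_value < 0.05)
```

Even this magnitude-blind test gives p = 1/4096, well below the 0.05 level, so the direction of the result does not depend on how large the individual error differences are.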

Two  models  that  demonstrate  the  unique  capabilities  of 
computer-generated  models  are  the  wing  aerodynamics 
and  Forward  Looking  Infrared  (FLIR)  models.  The 
aerodynamics  of  an  airfoil  such  as  a  wing  are  described 
by the Navier-Stokes equations. Unfortunately, even
with the vast power of today's computers, the full
Navier-Stokes equations are still too expensive to
solve for problems of interest. Instead, the equations
must be solved approximately using numerical methods
referred to as Computational Fluid Dynamics
(CFD). One such method is a panel code model
developed  at  the  Naval  Postgraduate  School  which 
assumes  that  the  airflow  is  inviscid,  incompressible  and 
irrotational.  As  can  be  seen  in  figure  8  for  a  thin  wing 
such  as  the  NACA  1412  airfoil,  these  assumptions  are 
valid  and  the  results  are  accurate.  However,  for  a 
thicker  wing  such  as  the  NACA  4421,  a  loss  of  lift  (CL) 
occurs  at  higher  angles  of  attack  (AOA)  due  to  a 
breakdown  of  at  least  one  assumption  as  can  be  seen  in 
figure 9. The computer-generated models, by
comparison, are able to learn from the data set that
there is a loss of lift at higher angles of attack for
thicker wings and are therefore able to make a much
more accurate prediction of CL.


Figure 9: Wing Model Comparison (legend: Equation, Computer, Test Data)

The  second  example  of  special  interest  is  the  FLIR 
models.  The  test  data  for  the  FLIR  in  its  Wide  Field  of 
View  (WFOV)  setting  for  the  detection  range  versus 
temperature  differential  between  a  fixed  size  test  target 
and  the  background  is  shown  in  figure  10. 


Figure  10:  WFOV  FLIR  Test  Data 


The  data  were  measured  by  multiple  students  at  the 
Naval  Test  Pilot  School  using  both  white  and  black  hot 
polarity settings on a commercial FLIR. The data set
contains scatter due to random measurement errors
from multiple students collecting the data, which made
it a challenging learning problem for the
computer-generated models. In
theory,  there  should  be  no  difference  between  the 
white  and  black  hot  settings.  However,  when  the 
computer-generated  model  was  created,  a  relationship 
was  found  between  the  polarity  setting  and  the  range. 
This  resulted  in  separate  predictions  for  the  two 
settings  in  the  computer-generated  model  as  shown  in 
figure 11. As can be seen in figure 11, the system
demonstrated less detection range capability in the
white hot setting. The computer-generated model
captured this by treating polarity setting as an
input, and it matched the test data far more closely
than the equation-based model prediction.


Figure 11: WFOV FLIR Model Comparison (x-axis: Effective delta T (deg C); legend: Equation, Computer (White), Computer (Black), White Test Data, Black Test Data)


There  is  no  mathematical  explanation  for  this  test 
result,  but  the  fact  that  it  exists  is  clearly  shown  in 
figure 11. Because of this unusual result, electro-optic
experts  from  the  Naval  Air  Test  Center  were  asked  to 
review  the  data.  They  verified  that  there  was  in  fact  a 
difference  between  the  polarity  settings  of  this 
particular  system,  attributing  the  difference  to  either  a 
display  unit  that  provides  a  better  display  of  dark  on  a 
light background or the possibility that the human eye
can  detect  black  on  white  better  than  white  on  black 
backgrounds. 

Discussion 

The  equation-based  models  presented  in  this  paper 
demonstrate  that  real  world  systems  rarely  perform  in 
accordance  with  theoretical  physics-based  equations. 
Equations  are  only  approximations  of  the  complexities 
of  the  real  world.  During  validation,  corrections  must 
be  made  to  the  model  equations  in  order  to  get  the 
predictions  to  match  the  real  world  data  within  the 
desired  accuracy.  This  may  include  additional  testing 
to  determine  the  source  of  the  difference  between  the 
equations  and  the  real  world  data.  By  continuing  to 
add  corrections,  the  equation-based  models  can  be 
made  as  accurate  as  the  computer-generated  models. 
However, this would require even greater human labor,
further increasing the time advantage and cost savings
of the computer-generated models over the
equation-based models.

By  comparison,  the  computer-generated  models  are 
capable  of  learning  the  relationships  between  the 
variables  including  the  many  non-linearities  and  other 
factors  that  are  not  captured  using  an  equation-based 
approach.  Additionally,  the  time  required  to  create  a 
model  using  this  technique  is  relatively  insensitive  to 
model  complexity.  Only  the  computer  run  time  while 
the  model  is  being  constructed  increases  significantly 
with  complexity,  adding  little  to  the  cost  of  model 
construction. 

The  authors  do  not  claim  that  computer-generated 
models  are  the  best  choice  in  every  case.  Each 
modeling  method  has  certain  advantages  depending  on 
specific  circumstances  of  what  is  being  modeled. 
Based  on  test  results,  the  following  circumstances 
favor  the  use  of  an  equation-based  approach: 

•  Validated  equation-based  models  already  exist 

•  Modeling  function  blocks  already  exist 

•  There  is  a  scarcity  of  available  data  on  what  is 
being  modeled 

•  The  element  being  modeled  does  not  require 
many  function  points 

The following circumstances favor a computer-generated
Bayesian network approach:

•  Database  of  observed  or  test  data  already 
exists 

•  Problem  is  not  well  understood  and/or 
equations  do  not  exist 

•  Problem  is  complex 

•  There  may  be  unknown  non-linearities 


•  Hidden  variable  relationships  may  exist 

•  Problem  is  a  control  application  or  decision 
problem 

The  conditions  most  favorable  to  computer-generated 
models  are  those  with  the  greatest  potential  to  reduce 
the  time  of  construction  and  expense  of  modeling  and 
simulation. 

Model  Integration 

Equation-based  models  and  computer-generated 
Bayesian  network  models  are  not  mutually  exclusive 
methods  of  modeling  and  simulation.  When  modeling 
complex  systems,  the  problem  is  usually  broken  down 
into  smaller,  simpler  subsystems  that  are  constructed, 
tested  and  then  integrated  into  the  final  complex  model 
or  simulation.  This  approach  lends  itself  to  creation  of 
integrated  models  where  each  component  to  be 
modeled  is  individually  evaluated  to  determine  which 
modeling  method  would  be  best  under  the  particular 
circumstances. In this research, one integrated
simulation  was  the  detection  of  a  target  aircraft  by  the 
radar  model.  In  this  example,  the  equation-based  radar 
model  previously  described  was  corrected  for  the 
validation  data  resulting  in  a  good  match  between 
predictions  and  real  world  tests.  An  aircraft  target 
model  was  created  using  the  equations  of  motion  to 
control  target  movement  within  the  simulation.  The 
radar  cross  section  of  the  target  was  modeled  as  a 
computer-generated  Bayesian  network  from 
unclassified  radar  cross  section  measurements  of  a 
World  War  II  aircraft.  Construction  of  a  physics-based 
equation  model  for  aircraft  radar  cross  section  would 
probably  be  impossible  on  a  desktop  computer. 
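
The integration described above can be sketched as a simulation loop in which equation-based components and a data-driven component are called side by side. The sketch below is illustrative only and rests on several assumptions: the RCS table is a hypothetical stand-in for the computer-generated Bayesian network (its values are placeholders, not measurements), the radar is reduced to the R^4 form of the radar range equation with all constants lumped into an assumed factor `k`, and the track, scan period, and threshold are invented for the example.

```python
import math
from typing import Callable, List, Tuple

# Data-driven component: stand-in for the computer-generated Bayesian network.
# HYPOTHETICAL aspect (deg) -> radar cross section (m^2) pairs, not measured data.
RCS_TABLE: List[Tuple[float, float]] = [
    (0.0, 2.0), (30.0, 0.5), (60.0, 4.0), (90.0, 20.0),
    (120.0, 3.0), (150.0, 0.8), (180.0, 5.0),
]

def rcs_from_data(aspect_deg: float) -> float:
    """Return the RCS of the table entry nearest to the given aspect angle."""
    return min(RCS_TABLE, key=lambda row: abs(row[0] - aspect_deg))[1]

# Equation-based component: target motion (constant velocity, assumed values).
START_M = (40_000.0, 30_000.0)   # initial position, radar at the origin
VELOCITY_MPS = (-200.0, -50.0)   # right to left and closing

def target_position(t: float) -> Tuple[float, float]:
    return START_M[0] + VELOCITY_MPS[0] * t, START_M[1] + VELOCITY_MPS[1] * t

def aspect_angle_deg(x: float, y: float) -> float:
    """Angle between the target's velocity and its line of sight to the radar."""
    los = (-x, -y)
    dot = VELOCITY_MPS[0] * los[0] + VELOCITY_MPS[1] * los[1]
    cos_a = dot / (math.hypot(*VELOCITY_MPS) * math.hypot(*los))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

# Equation-based component: radar range equation, SNR proportional to
# RCS / R^4, with every radar constant lumped into an assumed factor k.
def radar_snr(range_m: float, rcs_m2: float, k: float = 1.0e19) -> float:
    return k * rcs_m2 / range_m ** 4

def run_scans(rcs_model: Callable[[float], float], n_scans: int = 20,
              scan_period_s: float = 10.0, snr_threshold: float = 10.0):
    """Step the simulation scan by scan; return (time, range, detected) rows."""
    rows = []
    for i in range(n_scans):
        t = i * scan_period_s
        x, y = target_position(t)
        rng = math.hypot(x, y)
        snr = radar_snr(rng, rcs_model(aspect_angle_deg(x, y)))
        rows.append((t, rng, snr >= snr_threshold))
    return rows

for t, rng, detected in run_scans(rcs_from_data):
    print(f"t={t:5.0f} s  range={rng / 1000:5.1f} km  detected={detected}")
```

Because `run_scans` accepts any callable for the cross section, the table lookup could be replaced by a learned model without touching the equation-based parts, which is the kind of per-component choice the integrated approach relies on.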

The  radar  was  stationary  for  this  simulation.  The 
target  flies  a  closing  track  from  right  to  left  across  the 
front  of  the  radar.  The  target  tracking  simulation 
results  are  shown  in  figure  12.  As  can  be  seen  in  the 
target  track,  the  simulation  provides  an  extremely 
realistic  target  engagement.  As  the  target  aircraft 
moves  toward  the  radar,  both  the  range  and  aspect  of 
the  aircraft  change.  This  causes  scintillation,  where  the 
target  fades  in  and  out  between  different  radar  scans. 
Those  with  radar  operating  experience  will  recognize 
this  real  world  phenomenon  on  the  track  of  figure  12. 
In  addition  to  this  example,  several  other  integrated 
equation/computer-generated  models  were  constructed 
and tested. These models offered greatly improved
flexibility to the model builder based on the attributes
of the particular subelement being created.
Figure 12. Radar Target Tracking Simulation

Of  particular  note  were  models  constructed  using  the 
computer-generated  Bayesian  network  models  to  make 
human  decisions  or  otherwise  perform  control 
functions  within  the  simulations.  These  integrated 
simulations  were  shown  to  be  more  effective  than 
those  using  a  rule-based  approach  to  decision-making 
or  control. 

SIGNIFICANT  CONTRIBUTION 

The  primary  contribution  produced  by  this  research  is 
the  demonstration  that  highly  complex  model  elements 
can  be  created  directly  from  data  sets  in  a  small 
fraction  of  the  time  required  to  build  the  same  model 
using  a  manual,  equation-based  method.  Since 
modeling  costs  are  primarily  driven  by  human  labor, 
this  research  demonstrates  that  significant  cost 
reductions are possible. Not only were models created
in far less time, but in every case the
computer-generated models were more accurate than the
equation-based models prior to corrections for model
validation. The computer-generated models also
quantify the accuracy of the prediction, whereas the
equation-based models must be run many times to
obtain the same distributions. We also demonstrated
that computer-generated models can be integrated with
equation-based models, providing previously unavailable
flexibility in model element creation along with reuse
of  existing  model  assets.  Additionally,  the  research 
can  improve  training  simulations  through  construction 
of  computer-generated  models  of  the  actions  of  an 
adversary.  By  integrating  the  model  into  a  training 
simulation,  human  trainees  would  be  exposed  to 
training  scenarios  that  respond  to  their  actions  much 
more like the adversary would respond. Not only
would these models respond much more like humans
than rule-based models, but they could also be rapidly
and inexpensively updated as new information becomes
available.


CONCLUSIONS 

Modeling  and  Simulation  is  an  important  tool  in  the 
development  of  the  highly  effective  weapons  systems 
built  by  the  United  States  and  its  allies.  It  has  been 
demonstrated  that  the  benefits  of  M&S  are  applicable 
to  programs  of  any  size.  However,  M&S  generally 
requires  a  significant  upfront  investment;  one  that 
smaller  programs  cannot  afford  to  make.  This  research 
has  demonstrated  that  it  is  possible  to  significantly 
reduce  M&S  costs  making  this  tool  affordable  for 
small  programs  and  more  cost  effective  for  large  ones. 
Based on labor costs, the reduction demonstrated in
this research is approximately 90%, while prediction
error is simultaneously reduced by approximately 70%.
The authors acknowledge that computer-generated
Bayesian network models are not the optimal choice for
every situation. We demonstrate that integrated models
can be constructed using both equation-based and
computer-generated models.
Together,  these  integrated  models  and  simulations 
provide  tremendous  flexibility  to  the  model  developer. 

Work continues to further improve this process. The
method of discretization has been updated to
handle any data set. Future plans include automation
of  the  neural  network  build  process  and  integration  of 
the  structural  learning  program  into  the  main  software 
application.  These  improvements  are  designed  to 
further  reduce  human  labor  to  only  a  few  minutes, 
regardless of the size or complexity of the model.
Additionally,  we  are  continuing  to  explore  potential 
applications  of  creating  more  complex  integrated 
models  mixing  model  types.  The  ability  to  mix 
predictive  artificial  intelligence  decision  making  into 
physical  simulations  could  have  exciting  applications 
in  the  areas  of  training  and  operations  research. 